This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

[Model]enable glm4-9b #291

Merged: 9 commits merged into main on Jun 14, 2024

Conversation

@intellinjun (Contributor) commented Jun 13, 2024

Type of Change

feature or bug fix or documentation or others
API changed or not: no

Description

Enable glm-4-9b.
Detailed description:
q4j32cint8 result: [image]

Signed-off-by: intellinjun <[email protected]>
@intellinjun intellinjun requested review from zhentaoyu, Zhenzhong1 and a32543254 and removed request for zhentaoyu June 13, 2024 07:15
@a32543254 (Contributor) left a comment

LGTM

neural_speed/__init__.py (outdated, resolved)
@zhentaoyu (Contributor) commented

Please update the Q4 lambada_openai accuracy result once your PR is ready, since GLM has some special tokens.

neural_speed/__init__.py (resolved)
neural_speed/convert/convert_chatglm.py (outdated, resolved)
Signed-off-by: intellinjun <[email protected]>
Signed-off-by: intellinjun <[email protected]>
@a32543254 a32543254 merged commit ea20cc2 into main Jun 14, 2024
14 of 15 checks passed

3 participants